Smart Paper Notes

On-device Synaptic Memory Consolidation using Fowler-Nordheim Quantum-tunneling

Mustafizur Rahman , Subhankar Bose , Shantanu Chakrabartty

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-06-27

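Synaptic memory consolidation has been recognized as one of the key mechanisms supporting continual learning in neuromorphic artificial intelligence (AI) systems. Here we report that a Fowler-Nordheim (FN) quantum-tunneling device can implement synaptic memory consolidation similar to what can be achieved by algorithmic consolidation models such as the cascade and elastic weight consolidation (EWC) models. The proposed FN-synapse stores not only the synaptic weight but also the synapse's historical usage statistics on the device itself. We also show that the operation of the FN-synapse is near-optimal in terms of synaptic lifetime, and we demonstrate that a network comprising FN-synapses outperforms a comparable EWC network on a small benchmark continual-learning task. With an energy footprint of femtojoules per synaptic update, we believe the proposed FN-synapse provides an ultra-energy-efficient approach to implementing both synaptic memory consolidation and continual learning.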

translated by Google Translate

Process, Bias and Temperature Scalable CMOS Analog Computing Circuits for Machine Learning

Pratik Kumar , Ankita Nandi , Shantanu Chakrabartty , Chetan Singh Thakur

Categories: Machine Learning

2022-05-11

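Analog computing is attractive compared to digital computing because it can achieve higher computational density and higher energy efficiency. However, unlike digital circuits, conventional analog computing circuits cannot be easily mapped across different process nodes because of differences in transistor biasing, temperature variations, and limited dynamic range. In this work, we generalize the previously reported margin-propagation-based analog computing framework to design novel shape-based analog computing (S-AC) circuits that can be easily cross-mapped across different process nodes. Like digital designs, S-AC designs can also be scaled for precision, speed, and power. As a proof of concept, we show several examples of S-AC circuits implementing mathematical functions that are commonly used in machine learning (ML) architectures. Using circuit simulations, we demonstrate that the circuit input/output characteristics remain robust when mapped from a planar CMOS 180nm process to a FinFET 7nm process. Likewise, using benchmark datasets, we demonstrate that the classification accuracy of an S-AC-based neural network remains robust when mapped across the two processes and under temperature variations.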


Bias-Scalable Near-Memory CMOS Analog Processor for Machine Learning

Pratik Kumar , Ankita Nandi , Shantanu Chakrabartty , Chetan Singh Thakur

Categories: Artificial Intelligence | Machine Learning

2022-02-10

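Bias-scalable analog computing is attractive for implementing machine learning (ML) processors with distinct power-performance specifications. For example, ML implementations for server workloads focus on computational throughput and faster training, whereas ML implementations for edge devices focus on energy-efficient inference. In this paper, we demonstrate the implementation of bias-scalable analog computing circuits using a generalization of the margin-propagation (MP) principle called shape-based analog computing (S-AC). The resulting S-AC core integrates several near-memory compute elements, which include: (a) non-linear activation functions; (b) inner-product compute circuits; and (c) a mixed-signal compressive memory. Using measured results from prototypes fabricated in a 180nm CMOS process, we demonstrate that the performance of the computing modules remains robust to transistor biasing and temperature variations. In this paper, we also demonstrate bias-scalability on a simple ML regression task.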


Multiplierless MP-Kernel Machine For Energy-efficient Edge Devices

Abhishek Ramdas Nair , Pallab Kumar Nath , Shantanu Chakrabartty , Chetan Singh Thakur

Categories: Machine Learning | Artificial Intelligence | Neural and Evolutionary Computing

2021-06-03

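We present a novel framework for designing multiplierless kernel machines that can be used on resource-constrained platforms such as intelligent edge devices. The framework uses a piecewise-linear (PWL) approximation based on the margin-propagation (MP) technique and requires only addition/subtraction, shift, comparison, and register underflow/overflow operations. We propose a hardware-friendly MP-based inference and online training algorithm that has been optimized for a field-programmable gate array (FPGA) platform. Our FPGA implementation eliminates the need for DSP units and reduces the number of LUTs. By reusing the same hardware for inference and training, we show that the platform can overcome the classification errors and local minima that result from the MP approximation. The implementation of the proposed multiplierless MP-kernel machine on an FPGA results in an estimated energy consumption of 13.4 pJ and a power consumption of 107 mW, with ~9K LUTs and FFs each for a 256 x 32 sized kernel, making it superior in terms of power, performance, and area to other comparable implementations.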
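The abstract does not spell out margin propagation, so the following is only a minimal sketch of the constraint as it is usually stated in the MP literature: given scores L_i and a hyperparameter gamma > 0, find z such that sum_i max(L_i - z, 0) = gamma. The function name `margin_propagation`, the bisection solver, and the iteration count are illustrative assumptions, not the paper's hardware algorithm; bisection is used here only because it needs nothing beyond additions, comparisons, and halving.

```python
def margin_propagation(scores, gamma, iters=60):
    """Solve sum(max(s - z, 0) for s in scores) == gamma for z by bisection.

    Illustrative sketch only: the solver and its parameters are assumptions.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    lo = min(scores) - gamma  # residual at lo is at least gamma
    hi = max(scores)          # residual at hi is zero
    for _ in range(iters):
        mid = (lo + hi) / 2
        residual = sum(max(s - mid, 0.0) for s in scores)
        if residual > gamma:
            lo = mid  # threshold too low, raise it
        else:
            hi = mid  # threshold too high (or exact), lower it
    return (lo + hi) / 2


# With gamma = 1 only the largest score lies above the threshold,
# so z settles at approximately 4.0 - 1.0 = 3.0.
z = margin_propagation([1.0, 2.0, 4.0], gamma=1.0)
print(z)
```

The residual is piecewise linear and non-increasing in z, which is why a simple bracketing search is enough in this sketch.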


A Twitter BERT Approach for Offensive Language Detection in Marathi

Tanmay Chavan , Shantanu Patankar , Aditya Kane , Omkar Gokhale , Raviraj Joshi

Categories: Natural Language Processing

2022-12-20

Automated offensive language detection is essential in combating the spread of hate speech, particularly on social media. This paper describes our work on Offensive Language Identification in the low-resource Indic language Marathi. The problem is formulated as a text classification task to identify a tweet as offensive or non-offensive. We evaluate different monolingual and multilingual BERT models on this classification task, focusing on BERT models pre-trained on social media datasets. We compare the performance of MuRIL, MahaTweetBERT, MahaTweetBERT-Hateful, and MahaBERT on the HASOC 2022 test set. We also explore external data augmentation from other existing Marathi hate speech corpora, HASOC 2021 and L3Cube-MahaHate. MahaTweetBERT, a BERT model pre-trained on Marathi tweets, outperforms all other models when fine-tuned on the combined dataset (HASOC 2021 + HASOC 2022 + MahaHate), with an F1 score of 98.43 on the HASOC 2022 test set. With this, we also provide a new state-of-the-art result on the HASOC 2022 / MOLD v2 test set.


Leveraging Structure for Improved Classification of Grouped Biased Data

Daniel Zeiberg , Shantanu Jain , Predrag Radivojac

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-07

We consider semi-supervised binary classification for applications in which data points are naturally grouped (e.g., survey responses grouped by state) and the labeled data is biased (e.g., survey respondents are not representative of the population). The groups overlap in the feature space and consequently the input-output patterns are related across the groups. To model the inherent structure in such data, we assume the partition-projected class-conditional invariance across groups, defined in terms of the group-agnostic feature space. We demonstrate that under this assumption, the group carries additional information about the class, over the group-agnostic features, with provably improved area under the ROC curve. Further assuming invariance of partition-projected class-conditional distributions across both labeled and unlabeled data, we derive a semi-supervised algorithm that explicitly leverages the structure to learn an optimal, group-aware, probability-calibrated classifier, despite the bias in the labeled data. Experiments on synthetic and real data demonstrate the efficacy of our algorithm over suitable baselines and ablative models, spanning standard supervised and semi-supervised learning approaches, with and without incorporating the group directly as a feature.


Understanding Self-Predictive Learning for Reinforcement Learning

Yunhao Tang , Zhaohan Daniel Guo , Pierre Harvey Richemond , Bernardo Ávila Pires , Yash Chandak , Rémi Munos , Mark Rowland , Mohammad Gheshlaghi Azar , Charline Le Lan , Clare Lyle

Categories: Machine Learning | Artificial Intelligence

2022-12-06

We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite their recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that the self-predictive learning dynamics carry out a spectral decomposition of the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
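The two ingredients named in the abstract (a predictor optimized faster than the representation, and semi-gradient updates) can be illustrated on a toy problem. The sketch below is an assumption-laden idealization, not the paper's setup: states are tabular, the representation `phi` assigns one scalar to each state, the predictor is a single scalar `p` solved in closed form, and all states are weighted equally. The transition matrix `T` and all function names are hypothetical.

```python
def matvec(T, v):
    """Multiply a row-major matrix by a vector."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def semi_gradient_step(T, phi, lr):
    """One semi-gradient update of a 1-D representation (toy assumption)."""
    # Expected next-state latent; treated as a constant (stop-gradient).
    target = matvec(T, phi)
    # "Faster" predictor: solved to least-squares optimality before phi moves.
    p = dot(phi, target) / dot(phi, phi)
    # Gradient of 0.5 * sum((p * phi - target)^2) through the prediction only.
    grad = [(p * f - t) * p for f, t in zip(phi, target)]
    new_phi = [f - lr * g for f, g in zip(phi, grad)]
    return new_phi, grad, p


# Hypothetical 3-state transition matrix (rows sum to 1).
T = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.3, 0.0, 0.7]]
phi = [1.0, -0.5, 2.0]
new_phi, grad, p = semi_gradient_step(T, phi, lr=0.1)
# With the optimal predictor, the update direction is orthogonal to phi,
# so the step cannot shrink the representation toward zero.
print(dot(phi, grad), dot(phi, phi), dot(new_phi, new_phi))
```

In this toy case the orthogonality follows directly from the least-squares condition on `p`; whether and how it carries over to richer settings is the subject of the paper itself.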


A Probabilistic-Logic based Commonsense Representation Framework for Modelling Inferences with Multiple Antecedents and Varying Likelihoods

Shantanu Jaiswal , Liu Yan , Dongkyu Choi , Kenneth Kwok

Categories: Natural Language Processing

2022-11-30

Commonsense knowledge-graphs (CKGs) are important resources towards building machines that can 'reason' on text or environmental inputs and make inferences beyond perception. While current CKGs encode world knowledge for a large number of concepts and have been effectively utilized for incorporating commonsense in neural models, they primarily encode declarative or single-condition inferential knowledge and assume all conceptual beliefs to have the same likelihood. Further, these CKGs utilize a limited set of relations shared across concepts and lack a coherent knowledge organization structure resulting in redundancies as well as sparsity across the larger knowledge graph. Consequently, today's CKGs, while useful for a first level of reasoning, do not adequately capture deeper human-level commonsense inferences which can be more nuanced and influenced by multiple contextual or situational factors. Accordingly, in this work, we study how commonsense knowledge can be better represented by -- (i) utilizing a probabilistic logic representation scheme to model composite inferential knowledge and represent conceptual beliefs with varying likelihoods, and (ii) incorporating a hierarchical conceptual ontology to identify salient concept-relevant relations and organize beliefs at different conceptual levels. Our resulting knowledge representation framework can encode a wider variety of world knowledge and represent beliefs flexibly using grounded concepts as well as free-text phrases. As a result, the framework can be utilized as both a traditional free-text knowledge graph and a grounded logic-based inference system more suitable for neuro-symbolic applications. We describe how we extend the PrimeNet knowledge base with our framework through crowd-sourcing and expert-annotation, and demonstrate its application for more interpretable passage-based semantic parsing and question answering.


Optimization of side lobe level of linear antenna array using nature optimized ants bridging solutions(NOABS)

Sunit Shantanu Digamber Fulari

Categories: Neural and Evolutionary Computing

2022-10-16

Nature-inspired algorithms have brought solutions to complex, highly nonlinear optimization problems. The cost function (or fitness function) must be designed properly in terms of the parameters to be optimized; once this is done, the approach can be applied to any problem of this type. In this paper, nature-inspired algorithms play an important role in the optimal design of an antenna array with improved radiation characteristics. A 20-element linearly spaced array is used as an example of nature-inspired optimization in an antenna array system. The bridge-inspired army ant algorithm (NOABS) is used to reduce the side lobes and to improve the other radiation characteristics, showing the effect of the optimization on the design characteristics. The entire simulation is carried out on a 20-element linear antenna array.
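The abstract does not give its cost function, so the following is a minimal sketch of one common choice: the peak side lobe level (in dB relative to the main beam) of the array factor of a uniformly spaced linear array. Half-wavelength spacing, the angular grid, the Hamming-style taper, and all function names are assumptions made for illustration; NOABS itself is not implemented here.

```python
import cmath
import math


def array_factor_db(weights, spacing_wl=0.5, n_points=3601):
    """Normalized array-factor magnitude in dB for theta in [0, 180] degrees."""
    k_d = 2 * math.pi * spacing_wl  # phase step per element (spacing in wavelengths)
    mags = []
    for i in range(n_points):
        theta = math.pi * i / (n_points - 1)
        af = sum(w * cmath.exp(1j * k_d * n * math.cos(theta))
                 for n, w in enumerate(weights))
        mags.append(abs(af))
    peak = max(mags)
    return [20 * math.log10(max(m / peak, 1e-12)) for m in mags]


def side_lobe_level_db(pattern_db):
    """Highest level outside the main lobe, found by walking to the first minima."""
    peak = pattern_db.index(max(pattern_db))
    left = peak
    while left > 0 and pattern_db[left - 1] < pattern_db[left]:
        left -= 1
    right = peak
    while right < len(pattern_db) - 1 and pattern_db[right + 1] < pattern_db[right]:
        right += 1
    return max(pattern_db[:left] + pattern_db[right + 1:])


n_elements = 20
uniform = [1.0] * n_elements
# Hamming-style amplitude taper, used only as an example of non-uniform weights.
tapered = [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_elements - 1))
           for n in range(n_elements)]
sll_uniform = side_lobe_level_db(array_factor_db(uniform))
sll_tapered = side_lobe_level_db(array_factor_db(tapered))
print(sll_uniform, sll_tapered)
```

An optimizer would treat `side_lobe_level_db(array_factor_db(weights))` as the fitness to minimize over the element weights (or positions), which is the role the abstract assigns to the cost function.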


Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection

Omkar Gokhale , Aditya Kane , Shantanu Patankar , Tanmay Chavan , Raviraj Joshi

Categories: Natural Language Processing | Artificial Intelligence

2022-10-09

Pre-training large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. Although this method has proven effective in many domains, it might not always provide the desired benefits. In this paper, we study the effects of hateful pre-training on low-resource hate speech classification tasks. While previous studies on the English language have emphasized its importance, we aim to augment their observations with some non-obvious insights. We evaluate different variations of tweet-based BERT models pre-trained on hateful, non-hateful, and mixed subsets of a 40M tweet dataset. This evaluation is carried out for the Indian languages Hindi and Marathi. This paper provides empirical evidence that hateful pre-training is not the best pre-training option for hate speech detection. We show that pre-training on non-hateful text from the target domain provides similar or better results. Further, we introduce HindTweetBERT and MahaTweetBERT, the first publicly available BERT models pre-trained on Hindi and Marathi tweets, respectively. We show that they provide state-of-the-art performance on hate speech classification tasks. We also release hateful BERT models for the two languages and gold hate speech evaluation benchmarks, HateEval-Hi and HateEval-Mr, consisting of 2000 manually labeled tweets each. The models and data are available at https://github.com/l3cube-pune/MarathiNLP .
